{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "e83c2539",
   "metadata": {},
   "source": [
    "## Harnessing factor data using pandas_datareader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f932e769",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install pandas_datareader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "ec171920",
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "2ac3e1ac",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas_datareader as pdr\n",
    "from IPython.display import display"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "0d2f65cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Suppress all warnings to keep the notebook output readable\n",
    "warnings.filterwarnings(\"ignore\")"
   ]
  },
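  {
   "cell_type": "markdown",
   "id": "30423d49",
   "metadata": {},
   "source": [
    "Fetch the Fama-French research factors and store the result in `factors`. The call returns a dictionary: key `0` holds the monthly factors, key `1` holds the annual factors, and `'DESCR'` holds a text description of the dataset. With no date range given, the download covers roughly the most recent five years."
   ]
  },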
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "4b9f5acb",
   "metadata": {},
   "outputs": [],
   "source": [
    "factors = pdr.get_data_famafrench(\n",
    "    \"F-F_Research_Data_Factors\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "e8a1418f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'F-F Research Data Factors\\n-------------------------\\n\\nThis file was created by CMPT_ME_BEME_RETS using the 202404 CRSP database. The 1-month TBill return is from Ibbotson and Associates, Inc. Copyright 2024 Kenneth R. French\\n\\n  0 : (59 rows x 4 cols)\\n  1 : Annual Factors: January-December (5 rows x 4 cols)'"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display(factors[\"DESCR\"])"
   ]
  },
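  {
   "cell_type": "markdown",
   "id": "289f62f0",
   "metadata": {},
   "source": [
    "Display the first five rows of the monthly factors, the first dataset in `factors`. The columns are the market excess return (`Mkt-RF`), the size factor (`SMB`), the value factor (`HML`), and the risk-free rate (`RF`), all in percent."
   ]
  },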
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "40bc442f",
   "metadata": {},
   "outputs": [],
   "source": [
    "data = factors[0].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "3c08b717",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Mkt-RF</th>\n",
       "      <th>SMB</th>\n",
       "      <th>HML</th>\n",
       "      <th>RF</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019-06</th>\n",
       "      <td>6.93</td>\n",
       "      <td>0.29</td>\n",
       "      <td>-0.71</td>\n",
       "      <td>0.18</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-07</th>\n",
       "      <td>1.19</td>\n",
       "      <td>-1.93</td>\n",
       "      <td>0.48</td>\n",
       "      <td>0.19</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-08</th>\n",
       "      <td>-2.58</td>\n",
       "      <td>-2.38</td>\n",
       "      <td>-4.78</td>\n",
       "      <td>0.16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-09</th>\n",
       "      <td>1.43</td>\n",
       "      <td>-0.96</td>\n",
       "      <td>6.75</td>\n",
       "      <td>0.18</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-10</th>\n",
       "      <td>2.06</td>\n",
       "      <td>0.29</td>\n",
       "      <td>-1.91</td>\n",
       "      <td>0.16</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         Mkt-RF   SMB   HML    RF\n",
       "Date                             \n",
       "2019-06    6.93  0.29 -0.71  0.18\n",
       "2019-07    1.19 -1.93  0.48  0.19\n",
       "2019-08   -2.58 -2.38 -4.78  0.16\n",
       "2019-09    1.43 -0.96  6.75  0.18\n",
       "2019-10    2.06  0.29 -1.91  0.16"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display(data)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "67f824e3",
   "metadata": {},
   "source": [
    "Display the first five rows of the annual factors, the second dataset in `factors`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "aa37402f",
   "metadata": {},
   "outputs": [],
   "source": [
    "data = factors[1].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "64c1869d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Mkt-RF</th>\n",
       "      <th>SMB</th>\n",
       "      <th>HML</th>\n",
       "      <th>RF</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019</th>\n",
       "      <td>28.28</td>\n",
       "      <td>-6.14</td>\n",
       "      <td>-10.46</td>\n",
       "      <td>2.15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2020</th>\n",
       "      <td>23.66</td>\n",
       "      <td>13.18</td>\n",
       "      <td>-46.67</td>\n",
       "      <td>0.45</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021</th>\n",
       "      <td>23.56</td>\n",
       "      <td>-3.89</td>\n",
       "      <td>25.49</td>\n",
       "      <td>0.04</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2022</th>\n",
       "      <td>-21.60</td>\n",
       "      <td>-6.95</td>\n",
       "      <td>25.81</td>\n",
       "      <td>1.43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2023</th>\n",
       "      <td>21.70</td>\n",
       "      <td>-3.24</td>\n",
       "      <td>-13.60</td>\n",
       "      <td>4.95</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      Mkt-RF    SMB    HML    RF\n",
       "Date                            \n",
       "2019   28.28  -6.14 -10.46  2.15\n",
       "2020   23.66  13.18 -46.67  0.45\n",
       "2021   23.56  -3.89  25.49  0.04\n",
       "2022  -21.60  -6.95  25.81  1.43\n",
       "2023   21.70  -3.24 -13.60  4.95"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display(data)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4e33ca2",
   "metadata": {},
   "source": [
    "Fetch the same factors for January 2000 through December 2019 by passing `start` and `end`, and store the result in `factors`, replacing the earlier download."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "e3089efc",
   "metadata": {},
   "outputs": [],
   "source": [
    "factors = pdr.get_data_famafrench(\n",
    "    \"F-F_Research_Data_Factors\", start=\"2000-01-01\", end=\"2019-12-31\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "508a3e51",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{0:          Mkt-RF    SMB   HML    RF\n",
       " Date                              \n",
       " 2000-01   -4.74   5.77 -1.88  0.41\n",
       " 2000-02    2.45  21.36 -9.59  0.43\n",
       " 2000-03    5.20 -17.20  8.13  0.47\n",
       " 2000-04   -6.40  -6.68  7.26  0.46\n",
       " 2000-05   -4.42  -6.05  4.75  0.50\n",
       " ...         ...    ...   ...   ...\n",
       " 2019-08   -2.58  -2.38 -4.78  0.16\n",
       " 2019-09    1.43  -0.96  6.75  0.18\n",
       " 2019-10    2.06   0.29 -1.91  0.16\n",
       " 2019-11    3.87   0.77 -2.02  0.12\n",
       " 2019-12    2.77   0.73  1.75  0.14\n",
       " \n",
       " [240 rows x 4 columns],\n",
       " 1:       Mkt-RF    SMB    HML    RF\n",
       " Date                            \n",
       " 2000  -17.60  -4.60  44.98  5.89\n",
       " 2001  -15.21  18.16  18.52  3.83\n",
       " 2002  -22.76   4.38   8.10  1.65\n",
       " 2003   30.75  26.54   4.61  1.02\n",
       " 2004   10.72   4.46   7.61  1.20\n",
       " 2005    3.09  -2.36   9.41  2.98\n",
       " 2006   10.60   0.08  11.93  4.80\n",
       " 2007    1.04  -7.44 -17.18  4.66\n",
       " 2008  -38.34   2.40   1.05  1.60\n",
       " 2009   28.26   9.18  -9.65  0.10\n",
       " 2010   17.37  14.15  -5.17  0.12\n",
       " 2011    0.44  -5.73  -8.41  0.04\n",
       " 2012   16.27  -1.40   9.99  0.06\n",
       " 2013   35.20   7.73   2.60  0.02\n",
       " 2014   11.71  -7.80  -1.45  0.02\n",
       " 2015    0.09  -4.01  -9.65  0.02\n",
       " 2016   13.30   6.75  22.71  0.20\n",
       " 2017   21.51  -5.09 -13.57  0.80\n",
       " 2018   -6.95  -3.16  -9.67  1.83\n",
       " 2019   28.28  -6.14 -10.46  2.15,\n",
       " 'DESCR': 'F-F Research Data Factors\\n-------------------------\\n\\nThis file was created by CMPT_ME_BEME_RETS using the 202404 CRSP database. The 1-month TBill return is from Ibbotson and Associates, Inc. Copyright 2024 Kenneth R. French\\n\\n  0 : (240 rows x 4 cols)\\n  1 : Annual Factors: January-December (20 rows x 4 cols)'}"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display(factors)"
   ]
  },
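  {
   "cell_type": "markdown",
   "id": "7c1d4e2a",
   "metadata": {},
   "source": [
    "The factor values above are percentages, so a common next step is to convert them to decimal returns and add the risk-free rate back to the market excess return. The cell below is a minimal sketch of that step, not part of the original analysis: `sample` is a hand-copied stand-in for the last three monthly rows shown above, and the names `sample`, `decimal`, and `Mkt` are assumptions made for illustration. With a live download, `factors[0]` could be used in place of `sample`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b58f03d6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Illustrative stand-in: the last three monthly rows from the output above (in percent)\n",
    "sample = pd.DataFrame(\n",
    "    {\n",
    "        \"Mkt-RF\": [2.06, 3.87, 2.77],\n",
    "        \"SMB\": [0.29, 0.77, 0.73],\n",
    "        \"HML\": [-1.91, -2.02, 1.75],\n",
    "        \"RF\": [0.16, 0.12, 0.14],\n",
    "    },\n",
    "    index=pd.period_range(\"2019-10\", periods=3, freq=\"M\", name=\"Date\"),\n",
    ")\n",
    "\n",
    "# Convert percentages to decimal returns\n",
    "decimal = sample / 100\n",
    "\n",
    "# Total market return = market excess return + risk-free rate\n",
    "decimal[\"Mkt\"] = decimal[\"Mkt-RF\"] + decimal[\"RF\"]\n",
    "\n",
    "display(decimal)"
   ]
  },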
  {
   "cell_type": "markdown",
   "id": "095c43d5",
   "metadata": {},
   "source": [
    "**Jason Strimpel** is the founder of <a href='https://pyquantnews.com/'>PyQuant News</a> and co-founder of <a href='https://www.tradeblotter.io/'>Trade Blotter</a>. His career in algorithmic trading spans 20+ years. He previously traded for a Chicago-based hedge fund, was a risk manager at JPMorgan, and managed production risk technology for an energy derivatives trading firm in London. In Singapore, he served as APAC CIO for an agricultural trading firm and built the data science team for a global metals trading firm. Jason holds degrees in Finance and Economics and a Master's in Quantitative Finance from the Illinois Institute of Technology. His career spans America, Europe, and Asia. He shares his expertise through the <a href='https://pyquantnews.com/subscribe-to-the-pyquant-newsletter/'>PyQuant Newsletter</a> and social media, and has taught algorithmic trading with Python to more than 1,000 students in his popular course **<a href='https://gettingstartedwithpythonforquantfinance.com/'>Getting Started With Python for Quant Finance</a>**. All code is for educational purposes only. Nothing provided here is financial advice. Use at your own risk."
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "cell_metadata_filter": "-all",
   "main_language": "python",
   "notebook_metadata_filter": "-all"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
